Web Service

This month we're going to be looking at web servers. The internet is one of the areas where Unix still reigns supreme, as it was designed from the start to operate over a network. So it should come as no surprise that the most popular web server in use today runs on Unix. According to the Netcraft Survey there are almost 700,000 Apache web servers; that's 43% of all web servers on the internet. Best of all, Apache is free: the source code can be found on AFCD23, or you can get the latest version from the Apache ftp site.
Now I know what you're thinking: "What good is a web server to me? I don't have a permanent internet connection." That doesn't matter. In fact, if you were thinking of running a full-time web server I wouldn't recommend you use your Amiga. A PC running NetBSD and Apache can do the job just as well, leaving your Amiga free for bigger and better things!
The most common reason for wanting to install a personal web server is to create an environment where you can develop web sites before putting them onto a "live" server. Simple sites that only contain flat html pages do not require the use of a web server; however, if you wish to include cgi applications on your site a development server is essential.

CGI is the Common Gateway Interface, and in simple terms is a mechanism for running programs on your web server and displaying the output as an html page. CGI applications range from simple random image scripts to complex database storage and retrieval applications. One of the nice things about CGI is that it allows you to write your applications in almost any language you choose; however, most cgi scripts are written in perl. Perl is not included as part of NetBSD, but it is available free, and again we've provided the source code on AFCD23 (if you don't have the CD to hand you can also find the source code on the perl ftp site). Perl's popularity in CGI programming is partly because it is a scripting language. This means that, like the shell scripts we were looking at last month, you can simply type in a perl script and run it without the need for any compilation. Perl also has built-in features modelled on Unix tools, including the sed and awk text manipulation commands. This makes it ideal for processing text documents such as html pages.
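As a rough sketch of the sed- and awk-style text handling perl offers, the following fragment rewrites a line of html and splits a record into fields. The subroutine names (retitle, fields) and the sample strings are made up for illustration and do not come from the CD.

```perl
#!/usr/local/bin/perl
# Illustrative only: the subroutine names and sample data are invented.

# sed-style: replace whatever is inside <title>...</title> with new text.
sub retitle {
    my ($html, $title) = @_;
    $html =~ s/<title>.*?<\/title>/<title>$title<\/title>/i;
    return $html;
}

# awk-style: split a colon-separated record into its fields.
sub fields {
    my ($record) = @_;
    return split(/:/, $record);
}

print retitle("<title>Old</title>", "AmigaSoc"), "\n";
my @f = fields("ben:x:100:users");
print "user=$f[0] group=$f[3]\n";
```

Both jobs take a single line of perl each, which is why short text-processing scripts are so quick to write.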

Once you have compiled Apache and Perl you will need to configure Apache to suit your system. Details of how to do this are included with the source code. We've included a special Amiga Format configuration file, httpd.conf. As you would expect, the configuration file tells Apache how to run. Apache has many configurable options, such as which port to run on (usually port 80), what server name to use, and the location of the html files. Apache also has the ability to run virtual servers, that is, to allow one Apache binary to act as if it were many different web servers. In our configuration file we've defined a main web server, dev.amigasoc.org, which is where a lot of the cgi development for the uk.amigasoc.org website takes place, and a virtual server, www.wibble.wobble, just so you can be sure that I'm not cheating by showing you screengrabs of real web servers!
Finally, before you can start Apache you will need to make sure all the directories mentioned in httpd.conf really exist. The easiest way to do this is to unpack websites.tar into the /usr/local/etc/httpd/htdocs directory. This will create two subdirectories, "amigasoc" and "wibble", containing all the files needed for the rest of this tutorial. Once you are sure that everything is correct you can start the Apache web server. The Apache binary should be installed in /usr/local/etc/httpd/bin/; however, you can change this path by editing the httpd.h file before you compile Apache.

If you enter the URL http://localhost or http://dev.amigasoc.org into a web browser running on your machine you should be presented with the AmigaSoc front page, whilst the URL http://www.wibble.wobble will present you with a Wibble test page. If you reload the Wibble home page a number of times you will notice that the background and the picture on the page change every time. This is done using a perl cgi script, shown in listing 1. The HTML source code for the page contains references to the script instead of to an image file. Every time you load this page Apache runs the script and acts upon the output. In this case our script returns the path to an image file, so Apache loads that image onto the web page. The script itself is quite simple. As with last month's shell scripts, the first line tells NetBSD which interpreter to use to run the script, which in this case is the path you've installed the perl binary in. The script then reads the contents of an image directory into an array called "@images" using a combination of the perl "split" function and the unix "ls" command.

Listing 1 - random.pl


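#!/usr/local/bin/perl

# Read the list of background images into an array, one path per element
@images = split("\n",`/bin/ls -d ../backgrounds/*.gif`);

# Seed the random number generator from the current time and process id
srand(time ^ $$);

# Pick a random index into the array
$num = int(rand(@images));

# Redirect to the chosen image
print "Location: $images[$num]\n\n";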

The next two lines in the script deal with choosing the image. The first line of the pair initialises the random number generator using the current time combined with the process id (contained in the $$ variable). If this line was omitted, older versions of perl, which do not seed the generator automatically, would make the rand function in the next line choose the same number every time! Finally the script has to tell the web server which image to use in the document. The perl script cannot simply print the name of the image, as the web server will not understand what this means, so it is necessary to print an HTTP header at the start of any output from a cgi script. The HTTP header contains information required by the web server to deal with the rest of the script's output. In the case of our random image script the HTTP header is simply Location: followed by the path to the image; however, if we were using cgi to generate entire web pages we would use the HTTP header Content-type: text/html, followed by a blank line (\n\n).
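To illustrate the second kind of header, here is a minimal sketch of a script that generates a whole page rather than redirecting to an image. The subroutine name (page) and the page text are invented for this example.

```perl
#!/usr/local/bin/perl
# A minimal sketch of a cgi script that generates a whole page.
# The subroutine name and page text are invented for illustration.

sub page {
    my ($title, $body) = @_;
    # The header comes first, then a blank line, then the document itself.
    return "Content-type: text/html\n\n"
         . "<html><head><title>$title</title></head>\n"
         . "<body><h1>$title</h1><p>$body</p></body></html>\n";
}

print page("Hello", "This page was generated by a perl script.");
```

The blank line after the header matters: it is how the web server knows where the header ends and the document begins.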

Whilst creating pages that have random elements such as images on them may look nice, it's not the most useful thing in the world to use cgi for. A more useful application of cgi is to produce web pages based upon some input from a user. Many of the pages on the AmigaSoc website are generated "on the fly", based either on the response to various forms on our pages that require input, or on other constantly changing data, like our classifieds section. The next example takes the output from a form and performs various tasks on it. This script forms the basic engine for many AmigaSoc pages, including the feedback and helpdesk forms.

The most important job of any form processing script is to store the data somewhere for future reference. To do this, however, we must first extract the data from the form. Fortunately this job is made easy by a set of public domain perl routines. They are all included in the cgi-lib file, which must be in your cgi-bin directory for these scripts to work. Cgi-lib takes the data input into the form and presents it in an associative array. An associative array is a list of data where each data element is referred to by a name instead of a number. Cgi-lib ensures that the name needed to reference the data is the same name you gave that particular input field on your html form. This may sound horrendously complicated, but in fact it makes programming beautifully simple as long as you give a little thought to the layout of your html forms. Looking at the AmigaSoc feedback form you can see the second input box is for the user's email address. In the html source code for the page this input box is given the label "address". When cgi-lib processes the data contained in the form, the value in this field is placed into the associative array element "address". To access the data you simply refer to the element "address" of the array. Cgi-lib always calls the associative array "in".



$email_address = $in{address};
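To give a feel for what happens behind the scenes, here is a simplified sketch of how form data can be decoded into an associative array. This is not cgi-lib's own code: parse_form is a hypothetical name, the sample form data and address are invented, and the real library handles more cases than this.

```perl
#!/usr/local/bin/perl
# Simplified illustration of decoding form data into an associative array.
# This is not cgi-lib's own code; parse_form is a hypothetical name.

sub parse_form {
    my ($data) = @_;
    my %form;
    # Fields arrive as name=value pairs joined by "&".
    foreach my $pair (split(/&/, $data)) {
        my ($name, $value) = split(/=/, $pair, 2);
        $value = "" unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                              # "+" stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX is a hex-encoded character
        }
        $form{$name} = $value;
    }
    return %form;
}

my %in = parse_form("name=Joe+Bloggs&address=joe%40example.com");
my $email_address = $in{address};
print "$email_address\n";
```

Because the keys of the array are the field names from the form, adding a new input box to the html needs no change to the decoding code.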

Listing 2 - swear.pl

# The list of banned words
@rude = ("Ben","Vost","Nick","Veitch");

# Reject the message if it contains any banned word, ignoring case
foreach $word (@rude) {
        if ($in{message} =~ /\Q$word\E/i) {
                &no_message;
        }
}

Now that cgi-lib has done the hard work, all that is left for us to do with our perl script is generate an http header, and an appropriate html page. There are three different versions of the feedback script, feedback.pl, included on the CD. The first version simply prints the contents of the feedback form back to the user. The second version prints the data back, but also includes the AmigaSoc logo and background, whilst the third version emails the form data to the AmigaSoc webmaster and generates a confirmation html page personally addressed to the user who entered the data.
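As a rough idea of what the simplest of these scripts involves, the sketch below echoes a set of form fields back as an html list. It is not the feedback.pl from the CD: the subroutine names (escape, echo_page) and the sample data are invented, and the form data is passed in directly rather than read through cgi-lib.

```perl
#!/usr/local/bin/perl
# Rough sketch of a script that echoes form data back to the user.
# Not the feedback.pl from the CD; the names and sample data are invented.

# Escape characters that have a special meaning in html.
sub escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Build the http header followed by a page listing every field.
sub echo_page {
    my (%form) = @_;
    my $page = "Content-type: text/html\n\n<html><body>\n<ul>\n";
    foreach my $name (sort keys %form) {
        $page .= "<li>" . escape($name) . ": " . escape($form{$name}) . "</li>\n";
    }
    return $page . "</ul>\n</body></html>\n";
}

print echo_page(name => "Joe Bloggs", message => "I <3 NetBSD");
```

Escaping the values before printing them stops anything the user typed from being treated as html by the browser.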

Sometimes it may be desirable to perform some kind of check on the data before processing it any further. Listing 2 shows a pattern matching function used within the AmigaSoc graffiti wall and classifieds scripts to stop people posting obscene messages. It checks the message entered for a set of banned words and will generate an error message if anyone tries to include these words in their message. If the words are not present then the message is acceptable. For obvious reasons we can't print these words in the magazine, so I've replaced them with a selection of just as nasty, but printable, "4 letter words".

A final trick used by cgi is the use of environment variables to obtain certain information about the person reading the page. Usually these are used to produce a personalised greeting; however, as web browsers seem to be becoming more and more incompatible, use of the HTTP_USER_AGENT variable to tell surfers to upgrade or change their browser is becoming more common. Table 1 shows some of the common environment variables supported by most web servers.

Table 1 - Web Server Environment Variables

SERVER_NAME        the server's hostname or IP address
SERVER_SOFTWARE    name and version of the server software
REMOTE_HOST        the hostname of the client's machine
REMOTE_ADDR        the IP address of the client's machine
REMOTE_USER        the authenticated name of the user
HTTP_USER_AGENT    the browser the client is using
HTTP_REFERER       the URL of the last page visited
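A personalised greeting built from these variables might look something like the sketch below. Perl makes the environment available in the %ENV associative array; the subroutine name (greeting) and the wording of the message are invented for this example.

```perl
#!/usr/local/bin/perl
# Sketch of a greeting built from the environment variables set by the server.
# The subroutine name and wording are invented for illustration.

sub greeting {
    my (%env) = @_;
    # REMOTE_HOST may be empty if the server does not look up hostnames,
    # so fall back to the IP address.
    my $who     = $env{REMOTE_HOST} || $env{REMOTE_ADDR} || "stranger";
    my $browser = $env{HTTP_USER_AGENT} || "an unknown browser";
    return "Hello $who, you are using $browser.";
}

print "Content-type: text/plain\n\n";
print greeting(%ENV), "\n";
```

Run from the shell rather than through the web server, none of these variables will be set, so the script falls back to its default wording.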

In the final article we'll be taking a look at Java. Currently the only way you can run Java on your Amiga is through Unix.